Abstract

This paper precisely analyzes the wire density and required area in standard layout styles for the
hypercube. The most natural, regular layout of a hypercube of $N^2$ nodes in the plane, in an $N \times N$ grid
arrangement, uses $\lfloor 2N/3 \rfloor + 1$ horizontal wiring tracks for each row of nodes. (The number of tracks per
row can be reduced by 1 with a less regular design.) This paper also gives a simple formula for the wire
density at any cut position and a full characterization of all places where the wire density is maximized
(which does not occur at the bisection).

Keywords: interconnection networks, hypercube, wire density, VLSI layout area, mincut linear arrangement, 
optimal linear arrangement, channel routing 

1 Introduction 

The (binary) hypercube network has been widely considered as a network for parallel computing, but its
VLSI layout requires a great deal of wiring area. Studies of communications capabilities of the hypercube
versus other networks (e.g., [?], [?], [?], [?]) have varied the width of links between nodes in order to equalize
the hardware costs of the networks being compared under various cost measures, some of which are closely
related to VLSI layout area.

Recall that the interconnection pattern for a hypercube of $N^2$ nodes can be specified by numbering the
nodes from 0 to $N^2 - 1$ and requiring a link between any two nodes whose numbers expressed in binary differ
in exactly one bit. When the numbers differ in the $i$th bit from the right, we refer to the link between the
nodes as a dimension $i$ link. (Though the links between nodes are generally considered to be bidirectional,
we count them as one wire for simplicity. Results quoted in this paper must be multiplied by 2 to obtain
exact correspondence with results given in Dally [?] or Ranade and Johnsson [14].)
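The numbering rule above can be made concrete with a few lines of Python. This is only an illustrative sketch; the function name `hypercube_links` is ours and does not appear in the paper.

```python
def hypercube_links(num_nodes):
    """Return (u, v, dimension) for every link of a hypercube with num_nodes nodes.

    num_nodes is assumed to be a power of 2. Dimension 1 is the rightmost bit.
    """
    dims = num_nodes.bit_length() - 1
    links = []
    for u in range(num_nodes):
        for d in range(1, dims + 1):
            v = u ^ (1 << (d - 1))   # flip the d-th bit from the right
            if u < v:                # count each bidirectional link once
                links.append((u, v, d))
    return links


links = hypercube_links(8)
print(len(links))   # 12 links: 8 nodes * 3 dimensions / 2
print(links[:3])    # [(0, 1, 1), (0, 2, 2), (0, 4, 3)]
```

Each link is counted once, matching the one-wire-per-link convention adopted above.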

The network cost measure used by Dally [?] is bisection width (the minimum number of wires that must
be cut to divide the set of nodes into two equal halves with no connections between them). This measure
may be justified by Thompson's lower bound [15], [?], indicating that area is at least 1/4 of the square of
the bisection width. Thompson's bound, however, does not give a precise correspondence between bisection
width and area. Furthermore, as Dally notes, the maximum wire density (number of wires that must cross
a cutline) does not occur at the bisection in the "normal layout" of the hypercube (nodes placed as in
Figure 1). (Note that each row and column of the layout is itself a hypercube, so we can focus henceforth
on the layout of an $N$-node hypercube in a single row.)

Figure 1: The normal hypercube layout and a naive track assignment for $N^2 = 64$.

Ranade and Johnsson [14] consider the actual area required for the normal layout by bounding the number
of horizontal tracks per row required to lay out the interconnections (following the common approach of
placing vertical wires in one chip layer and horizontal wires in another). (The situation involving vertical
tracks is completely analogous to that involving horizontal tracks.) They focus, however, on optimality to
within an unspecified constant factor and only upper bound the number of tracks per row as $N - 1$, as
obtained by the assignment of wires to tracks illustrated in Figure 1.

A more sophisticated track assignment by Chen, Agrawal, and Burke [?] (with a different ordering of the
nodes) yields $N - \lg N$ tracks per row.¹

A still better measure for the number of tracks per row, utilized in [?], [?], is $\lfloor 2N/3 \rfloor$. That this number
represents the congestion for the natural embedding of the hypercube into a square grid also follows from
an independent statement of Nakano [13] and an argument of Bezrukov et al. [?].
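To give a feel for how far apart the three track counts quoted above are, the following sketch tabulates them for a few row sizes. The function name and dictionary keys are ours, chosen for illustration only.

```python
def track_bounds(n):
    """Tracks per row for an n-node row (n a power of 2) under three schemes."""
    lg_n = n.bit_length() - 1
    return {
        "naive (N - 1)": n - 1,
        "Chen et al. (N - lg N)": n - lg_n,
        "density (floor(2N/3))": (2 * n) // 3,
    }


for n in (4, 8, 16, 32, 64):
    print(n, track_bounds(n))
```

For small rows the counts nearly coincide; the gap between $N - \lg N$ and $\lfloor 2N/3 \rfloor$ only opens up as $N$ grows.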

This paper gives a short alternative proof of the congestion result that also yields a concise formula for
the wire density at every cut position and a full characterization of all positions where density is maximized.
The analysis is then extended to account for the exact placement of the terminals and wires in the layout.
It would be desirable to make all nodes identical, e.g., by placing the connections of each node in order of
dimension (as in Figure 1); this would be particularly convenient when implementing the common form of
hypercube algorithm referred to as a "normal algorithm" (e.g., see [?]), in which only one dimension of
communication links is used at any step, and the dimensions are used consecutively. Uniformity of nodes
is also helpful for assembling the system and for replacing defective nodes. We show that such a uniform
approach incurs a penalty of exactly one track per row in the VLSI layout, whereas full freedom to permute
the terminals allows a layout with $\lfloor 2N/3 \rfloor$ tracks per row.

The rest of this paper is organized as follows. Section 2 introduces notation and provides background
regarding the congestion result. Section 3 gives a simple formula for the wire density at each intercolumn
position and a full characterization of those positions where the density is maximized. Then the analysis
is extended to include the density at cutlines that run through nodes, which completes the analysis of the
number of wiring tracks required. Section 4 comments on hypercube layouts in which the nodes are placed
differently than in the normal scheme illustrated in Figure 1.

¹ We use $\lg x$ for $\log_2 x$, and we assume $N$ is a power of 2.



2 Background 

As a first step towards determining the usage of wiring tracks in the normal hypercube layout, we focus
on the intercolumn wire density per row. We give here a short proof that the maximum intercolumn wire
density per row in the normal hypercube layout is $\lfloor 2N/3 \rfloor$ and that the leftmost intercolumn position where
this maximum is realized is position $\lfloor (N+1)/3 \rfloor$. In the process we introduce notation for our main results
in the next section and note important symmetry properties.

We define $f(i,k)$ to be the number of dimension $k$ links (i.e., links spanning $2^{k-1}$ columns) that cross
intercolumn position $i$ in the normal layout. Using 0 to denote the position to the left of all the nodes, it
is easy to see that the pattern for $f(0,k), f(1,k), \ldots, f(N-1,k)$ is $0, 1, 2, \ldots, 2^{k-1}-1, 2^{k-1}, 2^{k-1}-1,
2^{k-1}-2, \ldots, 1$, and repeat as necessary; we may express this as

$$f(i,k) = i\left(1 - 2\left(\left\lfloor \frac{i-1}{2^{k-1}} \right\rfloor \bmod 2\right)\right) \bmod 2^k \tag{1}$$
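As a sanity check on Equation 1, the sketch below evaluates the closed form and compares it with a direct count of the dimension-$k$ links that straddle each position. The names `f_formula` and `f_brute` are ours; the brute-force version assumes nodes occupy columns 0 through $N-1$ and that position $i$ lies between columns $i-1$ and $i$.

```python
def f_formula(i, k):
    """Equation (1): number of dimension-k links crossing intercolumn position i."""
    sign = 1 - 2 * (((i - 1) >> (k - 1)) & 1)   # floor((i-1)/2^(k-1)) mod 2
    return (i * sign) % (1 << k)                # Python's % is non-negative here


def f_brute(i, k, n):
    """Count dimension-k links (a, a + 2^(k-1)) with a < i <= a + 2^(k-1)."""
    span = 1 << (k - 1)
    return sum(1 for a in range(n) if not a & span and a < i <= a + span)


print([f_formula(i, 3) for i in range(8)])   # [0, 1, 2, 3, 4, 3, 2, 1]
print(all(f_formula(i, k) == f_brute(i, k, 32)
          for k in range(1, 6) for i in range(32)))
```

The first line printed is the rise-and-fall pattern described in the text for $k = 3$.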



Then we define $S(i,N)$ to be the total number of connections crossing intercolumn position $i$ in the normal
layout, i.e.,

$$S(i,N) = \sum_{k=1}^{\lg N} f(i,k) \tag{2}$$



For the proof in this section, there is also a more convenient mathematical expression for the maximum
intercolumn wire density and the leftmost position where the maximum is realized:

$$m(N) = (4N - (-1)^{\lg N} - 3)/6 \tag{3}$$

$$p(N) = (N - (-1)^{\lg N})/3 \tag{4}$$



Then the result discussed in this section is that $\max_{0 \le i < N} S(i,N) = m(N)$ and that $i = p(N)$ is the
least $i$ at which the maximum is achieved. The result follows from the following Lemma and two Theorems:
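Before turning to the proof, the claim can be checked numerically for small rows. The sketch below is ours (helper names `lg`, `f`, `S`, `m`, `p` simply mirror the paper's symbols) and compares Equations 3 and 4 with a brute-force scan of $S(i,N)$ and with the floor expressions quoted earlier.

```python
def lg(n):
    return n.bit_length() - 1          # n is assumed to be a power of 2


def f(i, k):
    sign = 1 - 2 * (((i - 1) >> (k - 1)) & 1)
    return (i * sign) % (1 << k)       # Equation (1)


def S(i, n):
    return sum(f(i, k) for k in range(1, lg(n) + 1))   # Equation (2)


def m(n):
    return (4 * n - (-1) ** lg(n) - 3) // 6            # Equation (3)


def p(n):
    return (n - (-1) ** lg(n)) // 3                    # Equation (4)


for n in (2, 4, 8, 16, 32, 64):
    density = [S(i, n) for i in range(n)]
    peak = max(density)
    # columns: N, brute-force maximum, leftmost maximizer, m(N), p(N)
    print(n, peak, density.index(peak), m(n), p(n))
```

In each printed row the second and fourth entries agree, as do the third and fifth.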

Lemma 1 $S(i,N) = S(N-i,N)$ for $0 \le i \le N$.



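Proof The result follows from showing $f(i,k) = f(N-i,k)$ for $0 \le i \le N$ and $1 \le k \le \lg N$, which follows
from Equation 1:

$$\begin{aligned}
f(N-i,k) &= (N-i)\left(1 - 2\left(\left\lfloor \frac{N-i-1}{2^{k-1}} \right\rfloor \bmod 2\right)\right) \bmod 2^k \\
&= -i\left(1 - 2\left(\left\lfloor \frac{-i-1}{2^{k-1}} \right\rfloor \bmod 2\right)\right) \bmod 2^k && \text{since $N$ is a multiple of $2^k$} \\
&= i\left(1 - 2\left(\left\lfloor \frac{i-1}{2^{k-1}} \right\rfloor \bmod 2\right)\right) \bmod 2^k && \text{since the floor switches parity unless $i \equiv -i \equiv 2^{k-1} \pmod{2^k}$} \\
&= f(i,k). \qquad \blacksquare
\end{aligned}$$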
Theorem 2 $S(p(N),N) = m(N)$.

Proof The proof is by induction. The statement is trivial for $N = 1$. The induction hypothesis is that
$S(p(x),x) = m(x)$ for all $x$ that are powers of 2 less than $N$ and some $N \ge 2$. From this hypothesis,
we proceed to show that $S(p(N),N) = m(N)$:

$$\begin{aligned}
S(p(N),N) &= f(p(N),\lg N) + S(p(N),N/2) && \text{by Equation 2} \\
&= f(p(N),\lg N) + S(p(N/2),N/2) && \text{by Lemma 1 and Equation 4} \\
&= p(N) + m(N/2) && \text{by Equations 1 and 4 and the induction hypothesis} \\
&= m(N) && \text{by Equations 3 and 4.} \qquad \blacksquare
\end{aligned}$$
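The two appeals to Equations 3 and 4 in the proof of Theorem 2 rest on identities that are quick to verify by hand. As a worked check, write $s = (-1)^{\lg N}$ (our shorthand, not notation used elsewhere in the paper), so that $(-1)^{\lg (N/2)} = -s$, $p(N/2) = (N/2 + s)/3$, and $m(N/2) = (2N + s - 3)/6$. Then

$$\begin{aligned}
\frac{N}{2} - p(N) &= \frac{3N - 2(N - s)}{6} = \frac{N + 2s}{6} = \frac{N/2 + s}{3} = p(N/2), \\
p(N) + m(N/2) &= \frac{2(N - s)}{6} + \frac{2N + s - 3}{6} = \frac{4N - s - 3}{6} = m(N).
\end{aligned}$$

The first identity, combined with Lemma 1 applied to an $N/2$-node row, gives $S(p(N),N/2) = S(N/2 - p(N),N/2) = S(p(N/2),N/2)$; the second is the final step of the proof.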

Now we need only that $S(i,N) \le m(N)$, but the following theorem includes additional information to
make the proof easier:

Theorem 3 $S(i,N) \le \min\{m(N),\, m(N) - (p(N) - i)\}$ for $0 \le i \le N$.

Proof We again use induction and show that the statement follows under the assumption that it holds for
smaller values of $N$.
We note first that

$$\begin{aligned}
S(i,N) &= f(i,\lg N) + S(i,N/2) \\
&\le i + m(N/2) && \text{by Equation 1 and the induction hypothesis} \\
&= m(N) - (p(N) - i) && \text{by Equations 3 and 4}
\end{aligned} \tag{5}$$

All that remains is to show that $S(i,N) \le m(N)$, which we split into three cases according to the value
of $i$:

Case I: $i > N/2$. Since $S(i,N) = S(N-i,N)$ by Lemma 1, it suffices to consider cases II and III.
Case II: $i \le p(N)$. The result follows from Inequality 5.
Case III: $p(N) < i \le N/2$. We have

$$\begin{aligned}
S(i,N) &= f(i,\lg N) + S(i,N/2) \\
&= i + S(N/2 - i, N/2) && \text{by Equation 1 and Lemma 1} \\
&\le i + m(N/2) - (p(N/2) - (N/2 - i)) && \text{by the induction hypothesis} \\
&= m(N) && \text{by utilizing Equations 3 and 4.} \qquad \blacksquare
\end{aligned}$$
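The bound of Theorem 3 can also be checked exhaustively for small rows. The following sketch is ours; the helpers repeat the definitions of Equations 1 through 4, and `violations` is a hypothetical name for the list of counterexamples found.

```python
def lg(n):
    return n.bit_length() - 1          # n is assumed to be a power of 2


def f(i, k):
    sign = 1 - 2 * (((i - 1) >> (k - 1)) & 1)
    return (i * sign) % (1 << k)


def S(i, n):
    return sum(f(i, k) for k in range(1, lg(n) + 1))


def m(n):
    return (4 * n - (-1) ** lg(n) - 3) // 6


def p(n):
    return (n - (-1) ** lg(n)) // 3


def bound(i, n):
    """Right-hand side of Theorem 3."""
    return min(m(n), m(n) - (p(n) - i))


violations = [(n, i)
              for n in (2, 4, 8, 16, 32, 64, 128)
              for i in range(n + 1)
              if S(i, n) > bound(i, n)]
print(violations)   # an empty list means the bound held everywhere tested
```

The second term of the minimum matters only to the left of $p(N)$, where it caps the density by a line of slope 1 ending at $m(N)$.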



3 Number of wiring tracks 

Though we know the maximum intercolumn wire density per row in the layout of Figure 1, we still need to
determine the number of horizontal wiring tracks required to route the wires. Fortunately, an early channel
routing algorithm of Hashimoto and Stevens [?], the left-edge algorithm, guarantees that the density and
number of tracks are equal, since we have no vertical constraints (e.g., see [12]). To obtain a layout using
exactly $m(N)$ tracks, however, we must be free to permute the locations of connections on each hypercube
node so that the density (maximum number of wires crossing a vertical line) is no higher when the cutline
runs through nodes than when it runs between nodes. A layout using $m(N) = 5$ tracks for one row of the
64-node hypercube is illustrated in Figure 2. (This figure uses a track assignment slightly different than the
assignment produced by the left-edge algorithm in order to reduce the number of wire crossings.)

Figure 2: Wiring a row in $m(N) = 5$ tracks for $N = 8$.

Figure 3: Wiring a row requires $m(N) + 1 = 6$ tracks for $N = 8$ when the wires leaving each node are in
order of increasing dimension.
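The equality of density and track count can be illustrated with a small greedy router. The sketch below is a first-fit variant in the spirit of the left-edge algorithm, not a transcription of Hashimoto and Stevens's procedure. It models each wire only by the set of intercolumn positions it crosses, which presumes that terminals may be permuted freely so that a wire ending at a node and a wire starting at the same node can share a track. The names `row_wires` and `left_edge` are ours.

```python
def row_wires(n):
    """Wires of an n-node row as intervals (first, last) of crossed positions."""
    lg_n = n.bit_length() - 1
    wires = []
    for k in range(1, lg_n + 1):
        span = 1 << (k - 1)
        for a in range(n):
            if not a & span:                     # a is the left endpoint of a dimension-k link
                wires.append((a + 1, a + span))  # crosses positions a+1 .. a+span
    return wires


def left_edge(wires):
    """Assign wires, sorted by left edge, to the first track with room."""
    tracks = []                                  # each track: intervals in left-to-right order
    for first, last in sorted(wires):
        for track in tracks:
            if track[-1][1] < first:             # previous wire ends before this one starts
                track.append((first, last))
                break
        else:
            tracks.append([(first, last)])       # no track had room: open a new one
    return tracks


tracks = left_edge(row_wires(8))
print(len(tracks))   # 5, matching m(8)
for t in tracks:
    print(t)
```

A new track is opened only when every existing track holds a wire covering the current left edge, so the number of tracks never exceeds the maximum density.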

If we require that each node has its connections in order of dimensions $1, 2, \ldots, \lg N$, we cannot achieve
a routing in $m(N)$ tracks when $N > 2$; Figure 3 with 6 tracks shows the best layout of a row when $N = 8$.
Even with this fixed order of connections, however, the density (and therefore the number of tracks) is just
$m(N) + 1$ for $N > 2$. Our approach to obtaining this stronger result also produces a characterization of all
locations where the density is maximized. We encapsulate these results in the following two Theorems.

Theorem 4 The values of $i$ in binary for which $S(i,N)$ is maximized are those obtained as follows. Starting
from the leftmost bit of $i$ and moving right, choose pairs of bits to be 01 or 10 except that when $\lg N$ is even,
the last pair may be 11. When $\lg N$ is odd, the 1 remaining bit is set to 1.
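The bit-pair rule of Theorem 4 is easy to enumerate and compare against a brute-force search. The following sketch is ours; `maximizing_positions` builds the positions from the rule, while `brute_force_maximizers` scans $S(i,N)$ directly using Equations 1 and 2.

```python
from itertools import product


def maximizing_positions(n):
    """Positions i given by the rule of Theorem 4 for an n-node row."""
    lg_n = n.bit_length() - 1
    pairs, leftover = divmod(lg_n, 2)
    options = [("01", "10")] * pairs
    if leftover == 0 and pairs > 0:
        options[-1] = ("01", "10", "11")     # lg N even: the last pair may be 11
    suffix = "1" if leftover else ""         # lg N odd: the remaining bit is 1
    return sorted(int("".join(choice) + suffix, 2) for choice in product(*options))


def S(i, n):
    total = 0
    for k in range(1, n.bit_length()):
        sign = 1 - 2 * (((i - 1) >> (k - 1)) & 1)
        total += (i * sign) % (1 << k)       # Equation (1)
    return total


def brute_force_maximizers(n):
    density = [S(i, n) for i in range(n)]
    return [i for i, d in enumerate(density) if d == max(density)]


print(maximizing_positions(8))    # [3, 5]
print(maximizing_positions(16))   # [5, 6, 7, 9, 10, 11]
print(brute_force_maximizers(16))
```

For $N = 8$ the maximizers are 011 and 101; neither is the bisection position $N/2 = 4$, in line with the remark in the introduction.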

PROOF. Considering the number $i$ represented in binary, define $b(i,j)$ to be the bit in the $j$-th position
from the right ($1 \le j \le \lg N$), and define $e(i,j)$ to be the excess of 1's over 0's in bit positions greater
than $j$ (i.e., the number of 1's minus the number of 0's in the relevant portion of $i$'s representation). Also,
let $r$ denote the number of consecutive 0's at the right end of $i$'s representation. (Using the notation $0^r$ to
represent a string of $r$ 0's, note that with $i$ of the form $X10^r$, $i - 1$ is $X01^r$, and $-i$ is $\bar{X}10^r$, where $\bar{X}$ is the
bitwise complement of $X$.) Starting from the definitions of $S(i,N)$ and $f(i,k)$ in Equations 2 and 1, we see
that



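$$\begin{aligned}
S(i,N) &= \sum_{k=1}^{\lg N} i\,(1 - 2b(i-1,k)) \bmod 2^k \\
&= \sum_{k=1}^{\lg N} \sum_{j=1}^{k} 2^{j-1}\, b\bigl(i(1 - 2b(i-1,k)),\, j\bigr) \\
&= \sum_{j=1}^{\lg N} \sum_{k=j}^{\lg N} 2^{j-1}\, b\bigl(i(1 - 2b(i-1,k)),\, j\bigr) \\
&= \sum_{j=1}^{\lg N} \sum_{k=j}^{\lg N} 2^{j-1} \begin{cases} b(i,j) & \text{if } b(i-1,k) = 0 \\ b(-i,j) & \text{if } b(i-1,k) = 1 \end{cases} \\
&= \sum_{j=1}^{r} 0 + 2^{r}(\lg N - r) + \sum_{j=r+2}^{\lg N} 2^{j-2} \left( \lg N - j + 1 + \begin{cases} e(i-1,j-1) & \text{if } b(i,j) = 0 \\ -e(i-1,j-1) & \text{if } b(i,j) = 1 \end{cases} \right) \\
&= 2^{r}(\lg N - r) + \lg N \sum_{j=r+2}^{\lg N} 2^{j-2} - \sum_{j=r+2}^{\lg N} j\,2^{j-2} + \sum_{j=r+2}^{\lg N} 2^{j-2} \begin{cases} e(i,j) & \text{if } b(i,j) = 0 \\ -e(i,j) & \text{if } b(i,j) = 1 \end{cases} \\
&= \frac{1}{2}N + \sum_{j=r+2}^{\lg N} 2^{j-2} \begin{cases} e(i,j) & \text{if } b(i,j) = 0 \\ -e(i,j) & \text{if } b(i,j) = 1 \end{cases} \\
&= \frac{1}{2}N - \sum_{j=1}^{r+1} 2^{j-2} \begin{cases} e(i,j) & \text{if } b(i,j) = 0 \\ -e(i,j) & \text{if } b(i,j) = 1 \end{cases} + \sum_{j=1}^{\lg N} 2^{j-2} \begin{cases} e(i,j) & \text{if } b(i,j) = 0 \\ -e(i,j) & \text{if } b(i,j) = 1 \end{cases} \\
&= \frac{1}{2}N - \sum_{j=1}^{r} 2^{j-2}\,(e(i,0) + j) + 2^{r-1}\,(e(i,0) + r - 1) + \sum_{j=1}^{\lg N} 2^{j-2} \begin{cases} e(i,j) & \text{if } b(i,j) = 0 \\ -e(i,j) & \text{if } b(i,j) = 1 \end{cases} \\
&= \frac{1}{2}\,(e(i,0) + N - 1) + \sum_{j=1}^{\lg N - 1} 2^{j-2} \begin{cases} e(i,j) & \text{if } b(i,j) = 0 \\ -e(i,j) & \text{if } b(i,j) = 1 \end{cases}
\end{aligned} \tag{6}$$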



From this expression, we can see that $S(i,N)$ is maximized by setting pairs of bits greedily from the left end
of $i$'s representation, except for a slight variation when $j$ becomes small, as in the theorem statement. (It is
also easy to check that this maximum equals $m(N)$ of Equation 3.) ■
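Equation 6 can be cross-checked against the definition of $S(i,N)$. The sketch below is ours: `bit` and `excess` implement $b(i,j)$ and $e(i,j)$ as defined at the start of the proof, exact rational arithmetic is used because the $j = 1$ term carries a factor of $2^{-1}$, and the comparison is restricted to $1 \le i \le N-1$, the range in which $i$ has the form $X10^r$.

```python
from fractions import Fraction


def lg(n):
    return n.bit_length() - 1              # n is assumed to be a power of 2


def bit(i, j):
    """b(i, j): the j-th bit of i from the right (j = 1 is the rightmost)."""
    return (i >> (j - 1)) & 1


def excess(i, j, width):
    """e(i, j): ones minus zeros among bit positions j+1 .. width of i."""
    ones = sum(bit(i, t) for t in range(j + 1, width + 1))
    return 2 * ones - (width - j)


def S_direct(i, n):
    total = 0
    for k in range(1, lg(n) + 1):
        sign = 1 - 2 * bit(i - 1, k)
        total += (i * sign) % (1 << k)     # Equation (1)
    return total


def S_equation6(i, n):
    width = lg(n)
    total = Fraction(excess(i, 0, width) + n - 1, 2)
    for j in range(1, width):
        sign = 1 if bit(i, j) == 0 else -1
        total += Fraction(2) ** (j - 2) * sign * excess(i, j, width)
    return total


print([S_direct(i, 8) for i in range(1, 8)])            # [3, 4, 5, 4, 5, 4, 3]
print([int(S_equation6(i, 8)) for i in range(1, 8)])    # the same list
```

The weights $2^{j-2}$ grow with $j$, which is why the high-order bits are settled first when maximizing.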



Now we proceed to analyze the maximum density in a row of the layout when it is required that each
node has its connections in order of dimensions $1, 2, \ldots, \lg N$. We define $T(i,p,N)$ to be the number of wires
crossing a cutline just to the right of the $p$-th terminal position on a node in column $i - 1$ for $1 \le p \le \lg N$
(so $T(i,\lg N,N) = S(i,N)$).

Theorem 5 For $N > 2$, the maximum value of $T(i,p,N)$ over all $i$ and $p$ is $m(N) + 1$ and is realized at an
$i$ for which $S(i,N) = m(N)$.

PROOF. We can express $T(i,p,N)$ in terms of $S(i,N)$ by using the notation defined at the beginning of
the proof of Theorem 4; specifically, $T(i,p,N) = S(i,N) + e(i-1,p)$. The term $e(i-1,p)$ can be reexpressed
in terms of $e(i,p)$ based on the value of $r$ defined above. For $p > r$, we have $e(i-1,p) = e(i,p)$. For $p \le r$,
we have $e(i-1,p) = e(i,p) + 2(r - p - 1)$.

When $r = 0$, we know $p > r$, and we see that the strategy for choosing $i$ described in Theorem 4 remains
optimal, since the $e(i,p)$ term is small compared to $2^j$ for most values of $j$ in Equation 6. With such an $i$,
the largest $e(i,p)$ we can achieve is 1 (if at least one of the pairs of bits under the strategy of Theorem 4 is
10 or 11).

When $r = 1$, the situation is essentially the same as for $r = 0$, except that we must choose $p > 1$ to
maximize $e(i-1,p)$. We still must choose an $i$ that maximizes $S(i,N)$, and $e(i-1,p)$ will be at most 1.

Choosing $r \ge 2$ contradicts choosing $i$ to maximize $S(i,N)$, and the deficit in the value of $S(i,N)$ cannot
be recouped through the term $e(i-1,p)$. (For $r = 2$, $e(i-1,p)$ cannot exceed $e(i,p)$, while increasing values
of $r$ cause increasing deterioration in the value of $S(i,N)$.) ■
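The relation $T(i,p,N) = S(i,N) + e(i-1,p)$ and the value $m(N) + 1$ can be checked against a direct geometric count. The sketch below is ours and rests on a modelling assumption: each node carries $\lg N$ equally spaced terminals, with terminal $t$ attached to the dimension-$t$ link, and coordinates are scaled to integers so that cutlines fall strictly between terminals.

```python
def lg(n):
    return n.bit_length() - 1              # n is assumed to be a power of 2


def S(i, n):
    total = 0
    for k in range(1, lg(n) + 1):
        sign = 1 - 2 * (((i - 1) >> (k - 1)) & 1)
        total += (i * sign) % (1 << k)     # Equation (1)
    return total


def excess(i, j, width):
    """e(i, j): ones minus zeros among bit positions j+1 .. width of i."""
    ones = sum((i >> (t - 1)) & 1 for t in range(j + 1, width + 1))
    return 2 * ones - (width - j)


def T_brute(i, p, n):
    """Wires crossing a cut just right of terminal p on the node in column i-1."""
    width = lg(n)
    scale = 2 * (width + 1)                # terminal t of column c sits at scale*c + 2*t
    cut = scale * (i - 1) + 2 * p + 1      # odd coordinate: strictly between terminals
    count = 0
    for k in range(1, width + 1):
        span = 1 << (k - 1)
        for a in range(n):
            if not a & span:               # link between columns a and a + span
                left = scale * a + 2 * k
                right = scale * (a + span) + 2 * k
                if left < cut < right:
                    count += 1
    return count


def T_formula(i, p, n):
    return S(i, n) + excess(i - 1, p, lg(n))


n = 8
values = [T_brute(i, p, n) for i in range(1, n + 1) for p in range(1, lg(n) + 1)]
print(max(values))   # 6, i.e. m(8) + 1
```

For $N = 8$ the maximum is reached, for example, at $i = 5$, $p = 2$, where $S(5,8) = 5$ and the node in column 4 (binary 100) has its dimension-3 wire arriving from the left.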



Note that this result is not an idiosyncrasy of the particular ordering chosen for the terminals on each
node. Rather, because of the symmetry in the layout, it is apparent that any ordering that is the same for
all nodes leads to $m(N) + 1$ tracks; an ordering that reduces $T(i,p,N)$ where it exceeds $m(N)$ will make a
corresponding increase from $m(N)$ to $m(N) + 1$ in another position.
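For small rows one can simply try every common terminal ordering and confirm that none does better than $m(N) + 1$. The sketch below is ours and uses the same modelling assumption as before (equally spaced terminals, integer-scaled coordinates); `order` lists the dimension attached to each terminal from left to right and is identical on every node.

```python
from itertools import permutations


def max_density(n, order):
    """Maximum number of wires crossing any cutline for a common terminal order."""
    width = n.bit_length() - 1
    slot = {dim: t for t, dim in enumerate(order, start=1)}   # terminal index of each dimension
    scale = 2 * (width + 1)
    wires = []
    for k in range(1, width + 1):
        span = 1 << (k - 1)
        for a in range(n):
            if not a & span:
                wires.append((scale * a + 2 * slot[k],
                              scale * (a + span) + 2 * slot[k]))
    best = 0
    for c in range(n):
        for q in range(width + 1):          # cut after q terminals of the node in column c
            cut = scale * c + 2 * q + 1
            best = max(best, sum(1 for left, right in wires if left < cut < right))
    return best


for n in (4, 8):
    dims = range(1, n.bit_length())
    print(n, {order: max_density(n, order) for order in permutations(dims)})
```

For $N = 8$ all six orderings give a density of 6, so under this model no permutation of the terminals that is shared by all nodes recovers the missing track.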
Figure 4: The top row of a gray code derived layout for $N = 8$.



4 Alternative layouts 

Another frequently considered method of mapping hypercube nodes to a regular grid is to use a gray code
derived layout. The numbering of nodes in the top row of a gray code layout for a 64-node hypercube is
illustrated in Figure 4. (Here we have not required the terminals on each node to be in dimension order.)
Ranade and Johnsson [14] noted that the area and maximum wire length for the normal layout and the
gray code layout are the same up to a constant factor. In fact, the arguments of Sections 2 and 3 can be
extended to show that the maximum wire density and number of wiring tracks required per row is exactly
the same for the gray code layout as for the normal layout, including a one track penalty when the nodes
are identical. It is also easy to show that the total (horizontal) wire length per row is the same (in terms
of the number of columns spanned). The maximum (horizontal) wire length in a row of the normal layout,
however, is essentially half as large as for the gray code layout.
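The comparison between the two layouts can be reproduced for a single row with a short script. The sketch below is ours; it assumes the standard binary-reflected Gray code, $c \oplus \lfloor c/2 \rfloor$, for the node placed in column $c$, and it measures only intercolumn density, so the one-track penalty for identical nodes is not modelled.

```python
def gray(c):
    return c ^ (c >> 1)                    # binary-reflected Gray code (assumed numbering)


def row_profile(n, label):
    """Return (max density, total wire length, max wire length) for one row.

    label(c) is the hypercube node number placed in column c.
    """
    column = {label(c): c for c in range(n)}
    density = [0] * (n + 1)                # density[pos]: wires crossing position pos
    lengths = []
    for u in range(n):
        for d in range(n.bit_length() - 1):
            v = u ^ (1 << d)
            if u < v:                      # count each link once
                lo, hi = sorted((column[u], column[v]))
                lengths.append(hi - lo)
                for pos in range(lo + 1, hi + 1):
                    density[pos] += 1
    return max(density), sum(lengths), max(lengths)


print(row_profile(8, lambda c: c))   # normal layout: (5, 28, 4)
print(row_profile(8, gray))          # gray code layout: (5, 28, 7)
```

For $N = 8$ the maximum density and total wire length coincide, while the longest wire spans 4 columns in the normal layout and 7 in the gray code layout.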

The results of Harper [?], [?], Nakano [13], and Bezrukov et al. [?] show that the normal layout minimizes
total wire length and intercolumn wire density, while a different layout minimizes maximum wire length.
Bezrukov et al. also consider two new cost measures for embeddings of hypercubes into grids based on the
frequent use of normal algorithms [2].